We study two-player reachability-price games on single-clock timed automata. The problem is as
follows: given a state of the automaton, determine whether the first player can guarantee reaching
one of the designated goal locations. If a goal location can be reached then we also want to compute
the optimum price of doing so. Our contribution is twofold. First, we develop a theory of cost
functions, which provides a comprehensive methodology for the analysis of this problem. This theory
allows us to establish our second contribution, an EXPTIME algorithm for computing the optimum
reachability price, which improves the existing 3EXPTIME upper bound.

1 Introduction 

Timed automata are a formalism used for modeling real-time systems, i.e., systems whose behavior
depends on time. Timed automata are finite automata augmented with a set of clocks. The values of 
the clocks grow uniformly over time. There are two types of transitions: continuous, resulting in time 
progression, and discrete, resulting in a change of location. Discrete transitions may reset values of 
certain clocks to zero, and different transitions may be enabled at different clock values. 

Optimal schedule synthesis is one of the key areas of research in timed automata theory [9, 11].
In this setting, timed automata are augmented with pricing information, and each execution of the
automaton is assigned a payoff. Moreover, we want to model lack of full control over the system; game
theory is commonly used in this context [5, 9, 10, 12, 11]. There are two players, the minimizer and the
maximizer,¹ who have opposite goals of minimizing and maximizing the payoff of a play, respectively.
In this case a play is an execution of the automaton, and we are dealing with the worst-case scenario,
where the controller is interacting with an adversarial environment. In this context, reachability-price
games are commonly considered [6, 9, 11]. In these games the goal is to optimize the accumulated price
of reaching a designated set of states.

When the payoff of an execution is simply its time duration, synthesizing an almost-optimal
schedule is EXPTIME-complete. In linearly priced timed automata, the price of an individual
continuous transition is its duration multiplied by a location-specific price rate. Bouyer et al. show that
determining the existence of an optimal schedule for linearly priced timed automata with at least three
clocks is undecidable [6]. On the other hand, Bouyer et al. show a triply exponential algorithm for
single-clock linearly priced timed automata [9]. However, the exact complexity of the problem is still
unknown, as PTIME is the best lower bound that is currently known.

Contributions. In this paper, we present a new EXPTIME algorithm for optimal schedule synthesis
for linearly priced single-clock timed automata with non-negative price rates. Our work improves the
triply exponential algorithm given by Bouyer et al. [9].



¹ In the literature, these players are often referred to as the controller and the environment.
Our contribution is twofold. First, in order to deliver the main result, we establish technical results
regarding cost functions, i.e., piecewise affine continuous functions that are non-increasing. Cost
functions in this form were first considered by Bouyer et al. [9]. They are central to both algorithms, the one
presented in this paper, and that of Bouyer et al. [9], as both algorithms use them to produce their output.
The output of the algorithms is a function that assigns the optimal price of reachability to each state; this
output function can be represented by a finite set of cost functions. Our technical results regard
operations performed on cost functions during the execution of the algorithm. We establish the properties and
invariants of these operations, which later allows us to analyze the complexity, and prove the correctness
of the algorithm. This understanding of cost functions was pivotal in achieving the doubly-exponential
speedup, and we believe that this detailed analysis might prove useful in further closing the existing
complexity gap.

Second, we show an EXPTIME algorithm for computing the optimal price of reachability. As in
the approach of Bouyer et al. [9], the algorithm does the computation through a recursive procedure, with
respect to the number of locations of the timed automaton. Again, as in the work of Bouyer et al. [9], in each
recursive call we single out the location that minimizes the price rate. In the model, locations are assigned
to players, which necessitates different handling of a location, depending on its ownership. In the case
of the maximizer locations, our algorithm behaves exactly like the original; however, when it comes to
handling minimizer locations, we improve over the predecessor. The original algorithm would proceed to
recursively solve two subproblems, which resulted in an additional exponential blowup. The algorithm
presented in this paper, as in the case of maximizer locations, employs an iterative procedure which
prevents this blowup. The approach is similar in spirit to that used in handling maximizer locations;
however, the details are different.

Tools for handling games where the price of an execution is its time duration already exist (e.g.,
UPPAAL). We believe that the work presented in this paper may help in the development of such
tools for reachability-price games on linearly priced timed automata.



2 Preliminaries 

Cost functions. Below we introduce the notion of a cost function, and prove some of its basic prop- 
erties. Cost functions are a central notion when considering reachability-price games on single-clock 
timed automata. The theory of cost functions will be used to construct the algorithm for computing the 
optimal reachability cost, as well as to prove its correctness. 

In this paper, we will be dealing with the ordered set of real numbers augmented with the greatest
element, positive infinity. For that purpose we need to extend the $+$, $\min$ and $\max$ operators in a natural
way. For $a \in \mathbb{R} \cup \{\infty\}$ we have $a + \infty = \infty$, $\max(a, \infty) = \infty$ and $\min(a, \infty) = a$.

Definition 1 (Cost function) A function $f : I \to \mathbb{R} \cup \{\infty\}$ that is continuous, non-increasing, and
piecewise affine, where $I$ is a bounded interval, is said to be a cost function. We will write
$\mathcal{CF}(I) \subseteq [I \to \mathbb{R} \cup \{\infty\}]$ to denote the set of all cost functions with the domain $I$.

Remark 2 Notice that if $f \in \mathcal{CF}(I)$, and $f(x) = \infty$ for some $x \in I$, then $f \equiv \infty$ over $I$.

At times, we will need to talk about the individual affine functions, i.e., the pieces of a cost function.
To make this easier, we introduce the following convention. Given an interval $I$, let $f : I \to \mathbb{R}$ be a cost
function. We will write $f = (f_1, \ldots, f_k)$ to denote the fact that the piecewise affine function $f$ consists of
affine pieces $f_1, \ldots, f_k$, with domains $I_1^f, \ldots, I_k^f$, where $k$ is the smallest integer such that

\[
f(x) = \begin{cases} f_1(x) & \text{if } x \in I_1^f, \\ \quad\vdots \\ f_k(x) & \text{if } x \in I_k^f. \end{cases}
\]

Throughout the paper, we will be implicitly assuming that $I = [b, e]$, and that $I_i^f = [b_i^f, e_i^f]$, for
$i \in \{1, \ldots, k\}$, with $e_i^f = b_{i+1}^f$, for $i \in \{1, \ldots, k-1\}$. The formula for the individual segment $f_i$ will be given
by $a_i^f \cdot x + c_i^f$, for $i \in \{1, \ldots, k\}$. If $f$ is clear from the context, we will omit the superscript.
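The convention above can be mirrored in a small data structure. The following is a minimal sketch, not taken from the paper: the class name `CostFunction` and the breakpoint representation are assumptions made for illustration, only finite-valued cost functions are covered, and the minimality of $k$ (merging collinear neighbouring pieces) is not enforced.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CostFunction:
    """Continuous, non-increasing, piecewise affine function on [b, e].

    Hypothetical representation: breakpoints (x_0, y_0), ..., (x_k, y_k) with
    x_0 = b and x_k = e; piece i is the line joining point i-1 to point i.
    """
    points: Tuple[Tuple[float, float], ...]

    def __post_init__(self):
        xs = [x for x, _ in self.points]
        ys = [y for _, y in self.points]
        assert len(xs) >= 2, "at least one piece is required"
        assert all(l < r for l, r in zip(xs, xs[1:])), "breakpoints must increase"
        assert all(u >= v for u, v in zip(ys, ys[1:])), "must be non-increasing"

    def slopes(self) -> List[float]:
        """Slope a_i of every affine piece, from left to right."""
        p = self.points
        return [(y1 - y0) / (x1 - x0) for (x0, y0), (x1, y1) in zip(p, p[1:])]

    def __call__(self, x: float) -> float:
        p = self.points
        assert p[0][0] <= x <= p[-1][0], "x lies outside the domain"
        for (x0, y0), (x1, y1) in zip(p, p[1:]):
            if x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        raise AssertionError("unreachable")


# Two pieces on [0, 1]: slope -2 on [0, 0.5] and slope -6 on [0.5, 1].
f = CostFunction(((0.0, 5.0), (0.5, 4.0), (1.0, 1.0)))
```

Storing breakpoints rather than the coefficients $a_i$ and $c_i$ makes continuity hold by construction, which is why only monotonicity is checked.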

We now introduce two operators, which are key in defining the relationship between reachability
cost functions of a location and its successors. This will be later summarized by Lemma 13. Given a
cost function $f : [b, e] \to \mathbb{R} \cup \{\infty\}$ and a positive constant $c$, we define the following two operators, that
transform cost functions:

\[
\mathrm{minC}(f, c) = x \mapsto \min_{0 \le t \le e - x} \big( c\,t + f(x + t) \big)
\]

and $\mathrm{maxC}(f, c)$ defined analogously, with $\max$ substituted for $\min$.
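To make the two operators concrete, here is a minimal sketch that evaluates them pointwise. It assumes that the delay $t$ ranges over $[0, e - x]$, so that $x + t$ stays in the domain of $f$, and it represents $f$ by a list of breakpoints; the helper names `evaluate`, `wait_operator`, `min_c` and `max_c` are hypothetical.

```python
from typing import Callable, List, Tuple

Points = List[Tuple[float, float]]  # breakpoints (x, f(x)), x strictly increasing


def evaluate(points: Points, x: float) -> float:
    """Value at x of the piecewise affine function through the breakpoints."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the domain")


def wait_operator(points: Points, c: float, x: float, pick: Callable) -> float:
    """pick over t in [0, e - x] of c*t + f(x + t), where pick is min or max.

    c*t + f(x + t) is piecewise affine in t, so its extrema are attained at
    t = 0 or at a breakpoint of f to the right of x (including e itself).
    """
    targets = [x] + [bx for bx, _ in points if bx > x]
    return pick(c * (y - x) + evaluate(points, y) for y in targets)


def min_c(points: Points, c: float, x: float) -> float:
    return wait_operator(points, c, x, min)


def max_c(points: Points, c: float, x: float) -> float:
    return wait_operator(points, c, x, max)


f = [(0.0, 5.0), (0.5, 4.0), (1.0, 1.0)]  # slopes -2 and -6
```

With $c = 3$ the steep second piece (slope $-6$) is replaced by a line of slope $-3$ ending at $(1, 1)$; in this particular example that line also lies below the first piece, so `min_c` follows it on the whole of $[0, 1]$.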

Lemma 3 Let $c$ be a positive constant. If $f : [b, e] \to \mathbb{R}$ is a cost function then $\mathrm{minC}(f, c)$ is a cost
function as well. The same holds for $\mathrm{maxC}$.

The following Proposition formalizes the intuition of how the $\mathrm{minC}$ ($\mathrm{maxC}$) operators affect the
function $f$. The $\mathrm{minC}$ ($\mathrm{maxC}$) operator removes all pieces of $f$ that have slopes steeper (shallower) than $-c$,
and substitutes them with pieces that have a slope equal to $-c$. For the remaining pieces, the formula
remains unchanged, but the domain may change. However, the new domain is always a subset of the
domain in $f$.

Proposition 4 Let $f = (f_1, \ldots, f_k)$ and let $\mathrm{minC}(f, c) = (g_1, \ldots, g_l)$. We have that $l \le k$, and for every
$j \le l$: if the formulas for $g_j$ and $f_i$ are equal, for some $i \le k$, then $I_j^g \subseteq I_i^f$, otherwise $a_j^g = -c$.
Moreover, $a_j^g \ge -c$, for all $j = 1, \ldots, l$.

For $\mathrm{maxC}$ we can prove a similar result, with the only difference that, in the statement of Prop. 4, the
inequalities bounding the slopes are reversed ($<$ and $\le$ are substituted for $>$ and $\ge$).

Example 5 Fig. 1 gives an intuitive understanding of the $\mathrm{minC}(f, c)$ operator, for a cost function $f$ and
a positive real constant $c$. The cost function is given as $f = (f_1, \ldots, f_7)$ (as seen in Fig. 1a). The slope of
$f_2$ and $f_6$ is smaller than $-c$; for the remaining components it is greater. The cost function $g = \mathrm{minC}(f, c)$
is depicted in Fig. 1b, and is given by $(g_1, \ldots, g_6)$. All components of $g$ have a slope greater or equal to
$-c$. The formula for $g_1$ is the same as for $f_1$; however, the domain is a subset. The other components of $g$
that stem from $f$ either keep their formula on a smaller domain, like $g_1$, or coincide with a component of $f$
exactly. Functions $g_2$ and $g_5$ have the slope $-c$, and were not present in $f$.



Figure 1: a) Cost function $f$, before applying the $\mathrm{minC}(f, c)$ operator; dashed lines have a slope $-c$. b)
Cost function $g = \mathrm{minC}(f, c)$; dashed lines denote parts of $f$ that do not coincide with $g$.

Reachability-price games. A reachability-price game is played on a transition system whose states
are partitioned between two players, the minimizer and the maximizer. A game starts in some state, and
the players change the current state according to the transition rules, with the owner of the state deciding
which transition to take. The goal of the minimizer is to reach a state in the designated set of goal states,
whereas the goal of the maximizer is to prevent this from happening. Each transition incurs a price, and
the minimizer, if she can assure that a goal state is reached, wants to minimise the total price of doing so.
If the maximizer cannot prevent the minimizer from reaching a goal state, then his goal is to maximise
the total price of reaching one.

A weighted labeled transition system, or simply a transition system, $\mathcal{T} = (S, \Lambda, \to, \pi)$, consists of a
set of states, $S$, a set of labels, $\Lambda$, a labeled transition relation $\to \subseteq S \times \Lambda \times S$, and a price function $\pi$ that
assigns a real number to every transition. We will write $s \xrightarrow{\lambda} s'$ to denote a transition, i.e., an element
$(s, \lambda, s') \in \to$. We will say that the transition system is deterministic, if the relation $\to$ can be viewed as
a function $\to : S \times \Lambda \to S$.

Remark 6 For the remainder of the paper we will be considering deterministic weighted labeled
transition systems.

Weighted transition systems will be used to provide the semantics for single-clock timed automata,
considered in this paper. In this context, the restriction made in Rem. 6 is not constraining, as
single-clock timed automata yield deterministic transition systems. We place this restriction because it allows
for simpler definitions (e.g., we rely on this restriction when defining the notion of a run induced by the
players' strategies).

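A reachability-price game $\Gamma = (\mathcal{T}, S^{\mathrm{Min}}, S^{\mathrm{Max}}, S^{\mathrm{Goal}}, f^{\mathrm{Goal}})$ consists of a weighted labeled transition
system, $\mathcal{T}$, a partition of the transition system's set of states into the minimizer and maximizer states,
$S^{\mathrm{Max}} = S \setminus S^{\mathrm{Min}}$, a designated set of goal states, $S^{\mathrm{Goal}} \subseteq S$, and a goal cost function, $f^{\mathrm{Goal}} : S^{\mathrm{Goal}} \to \mathbb{R}$.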



Given a state $s$, a run of the game from $s$ is a (possibly infinite) sequence of transitions
$\omega = s_0 \xrightarrow{\lambda_1} s_1 \xrightarrow{\lambda_2} s_2 \cdots$, where $s = s_0$. If two runs $\omega = s_0 \xrightarrow{\lambda_1} \cdots \xrightarrow{\lambda_k} s_k$ and
$\omega' = s'_0 \xrightarrow{\lambda'_1} \cdots$ are such that $s_k = s'_0$ then $\omega\omega'$ denotes the run
$s_0 \xrightarrow{\lambda_1} \cdots \xrightarrow{\lambda_k} s_k \xrightarrow{\lambda'_1} \cdots$. Given a finite run $\omega$, $\mathrm{Len}(\omega)$ will denote its length,
i.e., the total number of transitions, $\mathrm{Last}(\omega)$ will denote the final state of $\omega$, i.e., $s_{\mathrm{Len}(\omega)}$,
and $\omega_n$ will denote the prefix of $\omega$ of length $n$, where $n \le \mathrm{Len}(\omega)$. The set of all runs of $\Gamma$ is
denoted by $\mathrm{Runs}$. The set of all finite runs of $\Gamma$ is denoted by $\mathrm{Runs}_{\mathrm{fin}}$. Note that $\mathrm{Runs}_{\mathrm{fin}} \subseteq \mathrm{Runs}$. We will
also write $\mathrm{Runs}(s)$ ($\mathrm{Runs}_{\mathrm{fin}}(s)$) to denote the set of all runs (all finite runs) starting in a state $s$.

A strategy of the minimizer is a partial function $\mu : \mathrm{Runs}_{\mathrm{fin}} \to \Lambda$ such that for every finite run $\omega$,
ending in a state of the minimizer, there is a transition $\mathrm{Last}(\omega) \xrightarrow{\mu(\omega)} s'$. We will say that $\mu$ is positional if it
can be treated as a function $\mu : S \to \Lambda$. We will write $\Sigma^{\mathrm{Min}}$ and $\Pi^{\mathrm{Min}}$ to denote the sets of all and all positional
strategies of the minimizer, respectively. The set of strategies for the maximizer is defined analogously.

Given a run $\omega'$ ending in a state $s_0$, and a pair of strategies $\mu \in \Sigma^{\mathrm{Min}}$ and $\chi \in \Sigma^{\mathrm{Max}}$, we write
$\mathrm{Run}(\omega', \mu, \chi)$ to denote the unique run $\omega \in \mathrm{Runs}(s_0)$ satisfying: if $s_i \xrightarrow{\lambda_{i+1}} s_{i+1}$ is the $(i+1)$-th transition
of $\omega$, then $\mu(\omega'\omega_i) = \lambda_{i+1}$ if $s_i \in S^{\mathrm{Min}}$, otherwise, $\chi(\omega'\omega_i) = \lambda_{i+1}$. Note that, if $\mu$ and $\chi$ are positional, then
$\omega'$ is irrelevant.

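Given a finite run $\omega$ we define its price, $\mathrm{Price}(\omega)$, as $\sum_{i=1}^{\mathrm{Len}(\omega)} \pi((s_{i-1}, \lambda_i, s_i))$, i.e., the total price
of its transitions. Given a run $\omega \in \mathrm{Runs}$, let $\mathrm{Stop}(\omega) = \min\{i : s_i \in S^{\mathrm{Goal}}\}$. The cost of a run $\omega$ is defined
as:

\[
\mathrm{Cost}(\omega) = \begin{cases} f^{\mathrm{Goal}}\big(\mathrm{Last}(\omega_{\mathrm{Stop}(\omega)})\big) + \mathrm{Price}\big(\omega_{\mathrm{Stop}(\omega)}\big) & \text{if } \mathrm{Stop}(\omega) < \infty, \\ \infty & \text{otherwise.} \end{cases}
\]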
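As an illustration of the two definitions, the sketch below computes the price and the cost of a finite run given as a list of transitions. The encoding of states, labels and the price function is hypothetical and chosen only for this example.

```python
import math
from typing import Callable, Hashable, List, Set, Tuple

State = Hashable
Transition = Tuple[State, Hashable, State]  # (source, label, target)


def price(run: List[Transition], pi: Callable[[Transition], float]) -> float:
    """Total price of the transitions of a finite run."""
    return sum(pi(t) for t in run)


def cost(start: State, run: List[Transition], pi: Callable[[Transition], float],
         goal: Set[State], f_goal: Callable[[State], float]) -> float:
    """Goal cost of the first goal state visited plus the price paid to get there.

    Returns infinity when the run visits no goal state.
    """
    states = [start] + [target for _, _, target in run]
    for stop, s in enumerate(states):
        if s in goal:  # stop plays the role of Stop(run)
            return f_goal(s) + price(run[:stop], pi)
    return math.inf


# Hypothetical run a -x-> b -y-> g -z-> a, with g the only goal state.
prices = {"x": 1.0, "y": 2.5, "z": 10.0}
run = [("a", "x", "b"), ("b", "y", "g"), ("g", "z", "a")]
pi = lambda t: prices[t[1]]
```

Only the prefix up to the first goal state contributes, so the expensive last transition of `run` is ignored by `cost` but counted by `price`.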

We now define the function $\mathrm{OptCost} : S \to \mathbb{R} \cup \{\infty\}$, which maps every state to the minimum cost of
reaching a goal state that can be guaranteed by the minimizer. If the maximizer can prevent the minimizer
from achieving a goal state, the cost is $\infty$. The function is defined as:

\[
\mathrm{OptCost}(s) = \inf_{\mu \in \Sigma^{\mathrm{Min}}} \sup_{\chi \in \Sigma^{\mathrm{Max}}} \mathrm{Cost}(\mathrm{Run}(s, \mu, \chi)).
\]
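On a finite transition system without cycles the infimum and supremum above reduce to a plain minimax recursion. The sketch below illustrates only that special case; the toy game, the dictionary encoding and the function name `opt_cost` are assumptions, and the approach does not apply to the uncountable state spaces that arise from timed automata.

```python
import math
from typing import Dict, List, Tuple


def opt_cost(state: str,
             edges: Dict[str, List[Tuple[float, str]]],
             owner: Dict[str, str],
             goal_cost: Dict[str, float]) -> float:
    """Minimax value of a finite, acyclic reachability-price game.

    edges maps a state to its (price, successor) pairs; goal states stop the play.
    """
    if state in goal_cost:
        return goal_cost[state]
    options = [p + opt_cost(t, edges, owner, goal_cost)
               for p, t in edges.get(state, [])]
    if not options:
        return math.inf  # dead end: no goal state can be reached
    return min(options) if owner[state] == "Min" else max(options)


# Hypothetical game: the minimizer owns s, the maximizer owns u.
edges = {"s": [(1.0, "u"), (4.0, "g")], "u": [(1.0, "g"), (5.0, "g2")]}
owner = {"s": "Min", "u": "Max"}
goal_cost = {"g": 0.0, "g2": 1.0}
```

Here the maximizer would answer a move to `u` with the expensive edge to `g2` (total 7), so the minimizer prefers to pay 4 and go to `g` directly.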

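Finally, we introduce the notion of $\varepsilon$-optimality, for $\varepsilon > 0$. We say that $\mu \in \Sigma^{\mathrm{Min}}$ is $\varepsilon$-optimal, if
$\sup_{\chi \in \Sigma^{\mathrm{Max}}} \mathrm{Cost}(\mathrm{Run}(\omega, \mu, \chi)) \le \mathrm{OptCost}(\mathrm{Last}(\omega)) + \varepsilon$ for all $\omega \in \mathrm{Runs}_{\mathrm{fin}}$. Given a strategy
$\mu \in \Sigma^{\mathrm{Min}}$ we say that $\chi \in \Sigma^{\mathrm{Max}}$ is $\varepsilon$-optimal for $\mu$, if $\mathrm{Cost}(\mathrm{Run}(\omega, \mu, \chi)) \ge \mathrm{OptCost}(\mathrm{Last}(\omega)) - \varepsilon$,
for all $\omega \in \mathrm{Runs}_{\mathrm{fin}}$.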

The decision problem associated with reachability-price games is the following:

Problem 7 Given a reachability-price game $\Gamma$, its state $s$, and a real constant $c$, determine whether
$\mathrm{OptCost}(s) \le c$.

If $\mathcal{T}$ or $\Gamma$ are not clear from the context we will write $\mathrm{OptCost}_\Gamma$, $\mathrm{Runs}_\Gamma$, etc.

Single-clock timed automata. In this paper we are considering timed automata with a single clock.
We write $X = \{x\}$ to denote the set containing the single clock $x$. A clock constraint is given by a closed
interval with non-negative integer end points. We write $\mathcal{B}(X)$ to denote the set of all clock constraints.
A clock valuation is a function that assigns a non-negative real value to the clock $x$; $\mathcal{V} = [X \to \mathbb{R}_{\ge 0}]$
denotes the set of all single clock valuations. A clock valuation $v$ satisfies a clock constraint $g \in \mathcal{B}(X)$ if
$v(x) \in g$, and this will be denoted by $v \models g$. We write $v_0$ to denote the $x \mapsto 0$ valuation. For a valuation $v$
and $t \in \mathbb{R}_{\ge 0}$ the valuation $v + t$ denotes the valuation $x \mapsto v(x) + t$.

A weighted single-clock timed automaton $\mathcal{A} = (L, E, \eta, \mathrm{urg}, \pi)$ consists of a finite set of
locations, $L$, an edge relation, $E \subseteq L \times \mathcal{B}(X) \times 2^X \times L$, an invariant specification, $\eta : L \to \mathcal{B}(X)$, an
urgency mapping, $\mathrm{urg} : L \to \{0, 1\}$, and a weight function, $\pi : L \cup E \to \mathbb{N}$.

We assume (without loss of generality) that $\mathcal{A}$ is clock-bounded, i.e., there exists a positive
constant $M$ such that $v \models \eta(l)$ implies $v(x) \le M$, for every location $l$.

The size of the automaton, denoted by $|\mathcal{A}|$, is the total number of bits needed to represent all of its
components; constants are encoded in binary.

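The semantics of a timed automaton $\mathcal{A}$ is given in terms of a deterministic weighted labeled
transition system $\mathcal{T}_{\mathcal{A}} = (S_{\mathcal{A}}, \Lambda_{\mathcal{A}}, \to_{\mathcal{A}}, \pi_{\mathcal{A}})$. The set of states $S_{\mathcal{A}} \subseteq L \times \mathcal{V}$ is such that $v \models \eta(l)$ for every
$(l, v) \in S_{\mathcal{A}}$. The set of labels is given by $\Lambda_{\mathcal{A}} = E \cup \mathbb{R}_{\ge 0}$. The transition relation, $\to_{\mathcal{A}}$, admits a transition
$(l, v) \xrightarrow{\lambda} (l', v')$ iff one of the following is true:

Discrete transition $\lambda = (l, g, Z, l') \in E$, $v \models g$, and if $Z = \emptyset$ then $v' = v$, otherwise $v' = v_0$.

Continuous transition $\lambda = t \in \mathbb{R}_{\ge 0}$, $\mathrm{urg}(l) = 0$, i.e., the location is non-urgent, for every
$t' \in (0, t)$ we have $v + t' \models \eta(l)$, $l = l'$, and $v' = v + t$.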
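The two transition rules can be sketched as a pair of successor functions. This is an illustrative encoding only: states are pairs of a location name and a clock value, intervals are integer pairs, and the names `discrete_step` and `delay_step` are hypothetical. Since invariants are intervals, checking the end point of a delay covers all intermediate delays, provided the source state already satisfies its invariant.

```python
from typing import Dict, Optional, Tuple

Interval = Tuple[int, int]            # closed interval [lo, hi], integer end points
State = Tuple[str, float]             # (location, clock value)
Edge = Tuple[str, Interval, bool, str]  # (source, guard, resets the clock?, target)


def satisfies(v: float, g: Interval) -> bool:
    return g[0] <= v <= g[1]


def discrete_step(state: State, edge: Edge,
                  invariant: Dict[str, Interval]) -> Optional[State]:
    """Target of a discrete transition, or None if the edge is not enabled."""
    l, v = state
    src, guard, reset, dst = edge
    if src != l or not satisfies(v, guard):
        return None
    v2 = 0.0 if reset else v
    # The target must itself be a state, i.e. satisfy the target invariant.
    return (dst, v2) if satisfies(v2, invariant[dst]) else None


def delay_step(state: State, t: float, invariant: Dict[str, Interval],
               urgent: Dict[str, bool]) -> Optional[State]:
    """Target of a continuous transition of duration t, or None if not allowed."""
    l, v = state
    if urgent[l] or t < 0:
        return None
    return (l, v + t) if satisfies(v + t, invariant[l]) else None


invariant = {"a": (0, 2), "b": (0, 1)}
urgent = {"a": False, "b": True}
```

The price of a successful delay would be the location's price rate multiplied by `t`, and the price of a discrete step would be the weight of the edge, as in the definition of the price function that follows.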



36 Two-Player Reachability-Price Games on Single-Clock Timed Automata 



Finally, the price function, $\pi_{\mathcal{A}}((l, v) \xrightarrow{\lambda} (l', v'))$, is defined as $\pi(\lambda)$ if $\lambda \in E$, and $\pi(l) \cdot \lambda$, otherwise.

We will often abuse notation, and treat the state of the automaton as an element of $L \times \mathbb{R}_{\ge 0}$, and the
clock valuation as a real variable.

Remark 8 We only allow runs that do not admit infinitely many consecutive continuous transitions. Note
that this requirement does not exclude Zeno runs, i.e., infinite runs whose total duration is finite.

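Reachability-price games on single-clock timed automata. Fix a partition of the set of locations,
$L = L^{\mathrm{Min}} \uplus L^{\mathrm{Max}}$, into the minimizer and maximizer locations, the set of goal locations $L^{\mathrm{Goal}} \subseteq L$, and a
function that assigns a cost function to every goal location, $\mathrm{CF}^{\mathrm{Goal}} : L^{\mathrm{Goal}} \to \mathcal{CF}([0, M])$, where
$M$ is the clock bound. We can define a reachability-price game on a single-clock timed automaton $\mathcal{A}$, by
defining a reachability-price game on its transition system $\mathcal{T}_{\mathcal{A}}$. The reachability-price game $\Gamma_{\mathcal{A}}$ is given
by $(\mathcal{T}_{\mathcal{A}}, S^{\mathrm{Min}}, S^{\mathrm{Max}}, S^{\mathrm{Goal}}, f^{\mathrm{Goal}})$, where: $S^{\mathrm{Min}} = S \cap (L^{\mathrm{Min}} \times \mathcal{V})$, $S^{\mathrm{Max}} = S \setminus S^{\mathrm{Min}}$, $S^{\mathrm{Goal}} = S \cap (L^{\mathrm{Goal}} \times \mathcal{V})$,
and $f^{\mathrm{Goal}}((l, x)) = \mathrm{CF}^{\mathrm{Goal}}(l)(x)$ for every state $(l, x) \in S^{\mathrm{Goal}}$.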

The size of the game, denoted by $|\Gamma|$, is the total number of bits needed to represent all of its
components; constants are encoded in binary.

Assumptions. We are going to place some restrictions on the structure of timed automata, which will
allow us to concentrate on the essence of the problem. In their work, Bouyer et al. place the same
restrictions, and argue that this is without loss of generality [9]. In particular, their complexity result is
stated only for the restricted automata.

Consider an interval $I$; we will write $\mathcal{A}_I$ for an $I$-bounded timed automaton, i.e., an automaton
whose transition system has the state space restricted to $L \times I$, and for every $l$, the invariant is $\eta(l) \cap I$.
To obtain the classical automaton we need to take $I = [0, \infty]$.

We say that an automaton $\mathcal{A}$ is simple if it is $[0, 1]$-bounded and for every discrete transition of its
transition system, the resetting set is empty, i.e., for every $e = (l, g, Z, l') \in E$ we have $Z = \emptyset$. Notice that
in simple timed automata time always progresses.

We have the following result regarding simple single-clock timed automata. 

Theorem 9 Problem 7 for reachability-price games on single-clock timed automata is polynomially
Turing reducible to the analogous problem on simple single-clock timed automata.

We simplify the automaton further, by assuming that the price of every discrete transition is 0. This
assumption allows for a clearer exposition, and is without loss of generality [9]. A technique similar
to that used to remove the resets can be employed. For simplicity, let $c \in \mathbb{N}$ be the constant used in
Problem 7. For every state $s$, we need to consider at most $c$ copies of the slightly modified game $\Gamma$, which
is played on a simple automaton with no prices on discrete transitions. Intuitively, the $\mathrm{OptCost}$ function
for the $i$-th copy gives the optimal cost of reaching goal, provided that at most $i$ transitions with non-zero
prices were executed. The $\mathrm{OptCost}$ function computed for the $i$-th copy is used to construct the $(i+1)$-th
copy. Each copy is treated independently, and although we might have to consider exponentially many,
this does not increase the complexity as our algorithm is in EXPTIME.

Simple timed automata admit three possible edge guards, namely $[0, 0]$, $[1, 1]$, and $[0, 1]$ (recall that we
are considering only closed intervals as clock constraints). The first kind does not allow for a continuous
transition prior to a discrete one, and it is satisfied only by finitely many states. As will be visible in the
proofs of Sec. 3, the value of the $\mathrm{OptCost}$ function for such states, due to the time progression property
of simple timed automata, does not "affect" the values for the other states. Transitions with guards of this
kind can be dealt with, in polynomial time, during post-processing. The effect of a discrete transition
featuring a guard of the second kind can be encoded using additional goal cost functions. Once again,
proofs in Sec. 3 explain how this can be done. It is only the third kind of guards that cannot be dealt with
by such simple means. In light of this, and to simplify the presentation, we assume that all transition
guards are true. A similar approach was used in the work of Bouyer et al. [9].

Remark 10 In the light of the assumptions made, it is natural to think of $E$ as a subset of $L \times L$.

We will also assume that from every state the cost of reaching a goal state is finite. In light of
Rem. 10 one can determine the set of states from which the maximizer can prevent reaching goal, by
determining the appropriate set of locations; this can be done in polynomial time. The real complexity
lies in determining the optimal cost of reaching a goal state, given that the minimizer can ensure it.

Operations. We will now define some simple algebraic operations that we will be performing on cost
functions, and reachability-price games on simple timed automata. These operations will be used in the
algorithm, presented in Sec. 3.

Given two functions $h : I_1 \to \mathbb{R}$ and $g : I_2 \to \mathbb{R}$, we will write $h \triangleright g$ to denote the override operation
on these two functions, defined as $(h \triangleright g)(x) = h(x)$ if $x \in I_1$ and $g(x)$ if $x \in I_2 \setminus I_1$.
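A minimal sketch of the override operation on functions over closed intervals follows; the function name `override` and the representation of a domain as a pair of end points are assumptions made for the example.

```python
from typing import Callable, Tuple

Interval = Tuple[float, float]  # closed interval [lo, hi]


def override(h: Callable[[float], float], dom_h: Interval,
             g: Callable[[float], float], dom_g: Interval) -> Callable[[float], float]:
    """Return the function that equals h on dom_h and g on the rest of dom_g."""
    def combined(x: float) -> float:
        if dom_h[0] <= x <= dom_h[1]:
            return h(x)          # h takes precedence wherever it is defined
        if dom_g[0] <= x <= dom_g[1]:
            return g(x)
        raise ValueError("x lies outside both domains")
    return combined


# h on [0, 0.5] overrides g on [0.5, 1]; at the shared point 0.5 the value of h wins.
k = override(lambda x: 10.0 - x, (0.0, 0.5), lambda x: 3.0, (0.5, 1.0))
```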

Fix an interval $I \subseteq [0, 1]$, an automaton $\mathcal{A}$, and a reachability-price game $\Gamma$ on $\mathcal{A}$. Below we list
three operations that, given a game $\Gamma$, produce a new game:

$\Gamma[\mathrm{urg}(l) := 1]$ denotes the game $\Gamma'$ obtained from $\Gamma$ by changing the urgency mapping of $\mathcal{A}$ so
that $l$ is an urgent location.

$\Gamma[L^{\mathrm{Goal}} \cup l, h]$ denotes the game obtained from $\Gamma$ by adding $l$ to the set of goal locations, with $h$ being
the cost function assigned to $l$. It gives the game $\Gamma'$, obtained from $\Gamma$, by setting $L^{\mathrm{Goal}} = L^{\mathrm{Goal}} \cup \{l\}$,
and defining the mapping from goal locations to cost functions, $\mathrm{CF}^{\mathrm{Goal}}$, as
$(l \mapsto h) \triangleright \mathrm{CF}^{\mathrm{Goal}}$. Function $h : I \to \mathbb{R}$ is a cost function, and $l$ is a location. We do not require
$l \in L$, i.e., $l$ can be a fresh location.

$\Gamma[E \cup e]$ denotes the game obtained from $\Gamma$ by adding an additional edge $e$ in the automaton $\mathcal{A}$. The new edge
set is equal to $E \cup \{e\}$, where $e \in L \times L$.
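The three operations can be pictured as non-destructive updates of a game record. The sketch below is an illustration under assumptions: the `Game` class, its fields and its method names are hypothetical, price rates and player ownership are omitted, and edges are pairs of locations as in Rem. 10.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class Game:
    locations: FrozenSet[str]
    edges: FrozenSet[Tuple[str, str]]
    urgent: FrozenSet[str]
    goal_cf: Dict[str, Callable[[float], float]]  # goal location -> cost function

    def make_urgent(self, l: str) -> "Game":
        """Counterpart of the operation that sets urg(l) := 1."""
        return replace(self, urgent=self.urgent | {l})

    def add_goal(self, l: str, h: Callable[[float], float]) -> "Game":
        """Add l (possibly fresh) as a goal location; h overrides any old entry."""
        return replace(self, locations=self.locations | {l},
                       goal_cf={**self.goal_cf, l: h})

    def add_edge(self, e: Tuple[str, str]) -> "Game":
        """Counterpart of the operation that adds the edge e."""
        return replace(self, edges=self.edges | {e})


g0 = Game(frozenset({"a", "b"}), frozenset({("a", "b")}), frozenset(),
          {"b": lambda x: 0.0})
g1 = g0.make_urgent("a").add_goal("a2", lambda x: 1.0 - x).add_edge(("a", "a2"))
```

Each method returns a new record and leaves the original untouched, which matches the way the operations are chained in the text.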

3 Results 

We are interested in solving reachability-price games algorithmically. To solve a reachability-price game
$\Gamma$ means to compute the $\mathrm{OptCost}$ function. In this section we present an algorithm for computing this
function. We start by introducing some preliminary notions, then we present the algorithm, and to
conclude this section we provide a proof of its correctness. The algorithm extends the work of Bouyer
et al. [9]. With each recursive call, it attempts to solve a game with one less non-urgent location. The
problem is polynomial-time solvable when only urgent locations are present [9].

In the following we will be considering a game $\Gamma$, and games derived from it, $\Gamma'$ and $\Gamma''$. Furthermore,
due to the iterative nature of our algorithm, we will often restrict the game to an interval, $I$. To ensure
clarity, we will be writing $\mathrm{OptCost}_{\Gamma_I}$ to explicitly indicate the game $\Gamma$ and the interval $I$, to which the
function refers. Unlike clock constraints, the interval $I$ will usually have rational endpoints.

At times, it will be convenient to treat $\mathrm{OptCost}$ as an element of $[L \to \mathcal{CF}(I)]$, rather than an element
of $[S \to \mathbb{R}]$. We will therefore abuse the notation, and write $\mathrm{OptCost}(l)$ to denote the function
$x \mapsto \mathrm{OptCost}(l, x)$.
To make handling of non-urgent locations easy, we introduce the following definition:

\[
\mathrm{NonUrgent}(\Gamma) = \{ l : l \in L \setminus L^{\mathrm{Goal}} \text{ and } \mathrm{urg}(l) = 0 \}.
\]

Fix a game $\Gamma$ and two intervals $I_1 = [b_1, e_1]$, $I_2 = [b_2, e_2]$ such that $r = e_1 = b_2$. We would like to have
a way of computing $\mathrm{OptCost}_{\Gamma_{I_1 \cup I_2}}$, provided that we have already computed $\mathrm{OptCost}_{\Gamma_{I_2}}$. To enable this
we define the following operation:

\[
\begin{aligned}
\mathrm{CostConsistent}(\Gamma_{I_1}, \mathrm{OptCost}_{\Gamma_{I_2}}) = \Gamma_{I_1}
  &[L^{\mathrm{Goal}} \cup l'_1,\; x \mapsto \mathrm{OptCost}_{\Gamma_{I_2}}(l_1, r) + (r - x)\,\pi(l_1)]\,[E \cup (l_1, l'_1)] \;\ldots \\
  &[L^{\mathrm{Goal}} \cup l'_k,\; x \mapsto \mathrm{OptCost}_{\Gamma_{I_2}}(l_k, r) + (r - x)\,\pi(l_k)]\,[E \cup (l_k, l'_k)],
\end{aligned}
\]

where $\{l_1, \ldots, l_k\} = \mathrm{NonUrgent}(\Gamma)$.
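The construction can be sketched as follows: for every non-urgent location a primed goal location is added whose cost function charges for waiting until $r$ and then continues with the value already computed to the right of $r$. The function name `cost_consistent`, the dictionary encoding of the game, and the convention of naming the fresh location by appending a prime are assumptions made for this illustration.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

CostFn = Callable[[float], float]


def cost_consistent(edges: Set[Tuple[str, str]],
                    goal_cf: Dict[str, CostFn],
                    non_urgent: Iterable[str],
                    rate: Dict[str, float],
                    opt_cost_right: Callable[[str, float], float],
                    r: float) -> Tuple[Set[Tuple[str, str]], Dict[str, CostFn]]:
    """Return the edge set and goal cost functions of the extended game.

    opt_cost_right(l, x) is the already computed OptCost to the right of r.
    """
    new_edges, new_goal_cf = set(edges), dict(goal_cf)
    for l in non_urgent:
        primed = l + "'"                       # fresh goal location l'
        exit_value = opt_cost_right(l, r)
        # Default arguments freeze the per-location values inside the lambda.
        new_goal_cf[primed] = lambda x, v=exit_value, p=rate[l]: v + (r - x) * p
        new_edges.add((l, primed))
    return new_edges, new_goal_cf


base_edges = {("a", "g")}
base_goal_cf = {"g": lambda x: 0.0}
right_values = {"a": 1.0, "b": 4.0}
edges2, goal_cf2 = cost_consistent(base_edges, base_goal_cf, ["a", "b"],
                                   {"a": 2.0, "b": 3.0},
                                   lambda l, x: right_values[l], 0.5)
```

Every added cost function has slope $-\pi(l)$ and is therefore non-increasing, as required of a cost function.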

The intuition behind the CostConsistent operation is as follows. In $\Gamma_{I_1}$, the time cannot progress
past $e_1$, whereas in $\Gamma_{I_1 \cup I_2}$ it can; this results in $\mathrm{OptCost}_{\Gamma_{I_1}}$ being unrelated to $(\mathrm{OptCost}_{\Gamma_{I_1 \cup I_2}})|_{L \times I_1}$,
although, due to the lack of resets, $\mathrm{OptCost}_{\Gamma_{I_2}}$ is equal to $(\mathrm{OptCost}_{\Gamma_{I_1 \cup I_2}})|_{L \times I_2}$. To alleviate this, for every
non-urgent location $l$, we add a new goal location $l'$ whose cost function encodes the following
behavior: upon entering $l$ wait until time $e_1$, and then reach goal, from the state $(l, e_1)$, "optimally" as if $\Gamma_{I_2}$
was the game being played. This intuition is formalized by the following lemma.

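Lemma 11 If $\Gamma'_{I_1} = \mathrm{CostConsistent}(\Gamma_{I_1}, \mathrm{OptCost}_{\Gamma_{I_2}})$ then

\[
\mathrm{OptCost}_{\Gamma'_{I_1}} \triangleright \mathrm{OptCost}_{\Gamma_{I_2}} = \mathrm{OptCost}_{\Gamma_{I_1 \cup I_2}}.
\]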

We will also be considering situations where we have already computed $\mathrm{OptCost}_\Gamma(l)$ for some
location $l$ of $\Gamma$, and we will want to use this fact to compute $\mathrm{OptCost}_\Gamma$ for the remaining locations.

Lemma 12 Given a game $\Gamma$ over an interval $I$, a location $l$, and a cost function $h : I \to \mathbb{R}$, if $h(x) =
\mathrm{OptCost}_\Gamma(l, x)$ for every $x \in I$ then

\[
\mathrm{OptCost}_{\Gamma[L^{\mathrm{Goal}} \cup l,\, h]}(l', x) = \mathrm{OptCost}_\Gamma(l', x),
\]

for every location $l' \in L$ and every clock valuation $x \in I$.



Lemma 12 is a direct consequence of the following Lemma, which characterizes the relation between
the values of optimal reachability cost of adjacent locations.

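Lemma 13 Given $l \in L^{\mathrm{Min}}$, let $h(x) = \min\{\mathrm{OptCost}(l', x) : (l, l') \in E\}$; we then have

\[
\mathrm{OptCost}(l) = \mathrm{minC}(h, \pi(l)).
\]

If $l \in L^{\mathrm{Max}}$ then, if we substitute $\max$ for $\min$ and $\mathrm{maxC}$ for $\mathrm{minC}$, the same equality holds.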
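The equality of Lemma 13 can be evaluated pointwise for a minimizer location. The sketch below is an illustration under assumptions: successor cost functions are given by breakpoints, the delay ranges over the remaining part of the interval, and the function names are hypothetical. It uses the fact that minimising over delays and minimising over successors commute, so the breakpoints of the individual successors suffice and the breakpoints of $h$ itself need not be computed.

```python
from typing import List, Tuple

Points = List[Tuple[float, float]]  # breakpoints (x, value), x strictly increasing


def evaluate(points: Points, x: float) -> float:
    """Value at x of the piecewise affine function through the breakpoints."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the domain")


def opt_cost_min_location(successors: List[Points], rate: float, x: float) -> float:
    """min over successors l' and delays t of rate*t + OptCost(l')(x + t)."""
    best = float("inf")
    for points in successors:
        # The optimum delay ends at x itself or at a breakpoint to its right.
        targets = [x] + [bx for bx, _ in points if bx > x]
        best = min(best, min(rate * (y - x) + evaluate(points, y) for y in targets))
    return best


# Hypothetical successors on [0, 1]: one steep (slope -4), one constant.
steep = [(0.0, 5.0), (1.0, 1.0)]
flat = [(0.0, 3.0), (1.0, 3.0)]
```

With price rate 1 the minimizer prefers to wait until the right end point and then move to the steep successor, so the resulting value is $1 + (1 - x)$ on the whole interval.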
Algorithm. We will define a recursive function SolveRP that solves a reachability-price game $\Gamma_I$,
where $I = [b, e] \subseteq [0, 1]$ and the automaton underlying $\Gamma$ is simple. Upon termination, the function
outputs $\mathrm{OptCost}_{\Gamma_I}$.

The algorithm works recursively, with respect to the set of non-urgent locations. During each
recursive call, it identifies a non-urgent location that minimises the weight function. There are two cases to
consider, depending on the ownership of the location; however, both of them are handled in a similar
fashion. The algorithm modifies the game $\Gamma$ to have one less non-urgent location. In case $l \in L^{\mathrm{Max}}$, we
convert $l$ to be urgent, whereas if $l \in L^{\mathrm{Min}}$, we convert $l$ to be a goal location that captures the following
behaviour: once $l$ is reached in $\Gamma$, the minimizer spends all available time there. The intuition behind
this is as follows: if $l \in L^{\mathrm{Max}}$, it is unlikely that spending time in that location will be beneficial for the
maximizer. Likewise, when $l \in L^{\mathrm{Min}}$, it is likely that it will be beneficial for the minimizer to stay as long
as possible. There are cases, however, when this intuition is incorrect, i.e., it is beneficial, respectively,
for the maximizer to wait, and for the minimizer to move immediately. This necessitates the iterative
procedure, outlined in the following, employed during each recursive call.

The working assumption is that $\mathrm{OptCost}$ is, locationwise, a cost function. During each recursive call,
the algorithm iteratively computes the result of the $\mathrm{minC}$ ($\mathrm{maxC}$) operator applied to the minimum
(maximum) of the cost functions of the location's successors (that are equal to $\mathrm{OptCost}$). The iterative procedures
in cases 2 and 3 of the algorithm compute the solution over a sequence of intervals, proceeding from
the right to the left of the time axis. They first assume that the aforementioned intuition is correct (step
1), and then identify the rightmost interval over which it is not (step 2). The next step is to adjust the
solution over that interval (steps 2 and 3). It remains to find the solution to the left of the found interval.
This is done in the subsequent iterations.

We now present the recursive algorithm SolveRP($\Gamma_I$). There are three cases to consider.

First case: $\mathrm{NonUrgent}(\Gamma_I) = \emptyset$. $\mathrm{OptCost}(l)$ is a cost function (for every location $l$) and can be
computed by solving a finite game in polynomial time. If $\mathrm{CF}^{\mathrm{Goal}}$ has $p$ pieces in total, then
$\mathrm{OptCost}$ has at most $2p$ pieces [9].

Second case: $L^{\mathrm{Max}} \ni l^* = \mathrm{argmin}\{\pi(l) : l \in \mathrm{NonUrgent}(\Gamma)\}$. In Case 2 of the algorithm, an iterative
procedure is applied to compute $\mathrm{OptCost}_\Gamma$ over the interval $I = [b, e]$; in each iteration, the computation
is restricted to the interval $[b, r]$, with $r = e$ in the first iteration. First, in Step 1, a game $\Gamma'$ with one
less non-urgent location is constructed. We obtain $\Gamma'$ from $\Gamma$ by making $l^*$ an urgent location; this captures
the intuition that, since $l^*$ minimizes the weight function, it is beneficial for the maximizer to leave $l^*$
immediately. Second, in Step 2, the procedure identifies the rightmost interval over which the function
$f = \mathrm{OptCost}_{\Gamma'}(l^*)$, computed in Step 1, has an affine piece with the slope strictly shallower than $-\pi(l^*)$;
the affine piece and the interval are denoted by $f_i$ and $[b_i, e_i]$, respectively. Third, also in Step 2, a new game,
$\Gamma''$, is constructed; we are considering this game over the interval $[b_i, e_i]$. Like $\Gamma'$, the game $\Gamma''$ has one
less non-urgent location than the game $\Gamma$; it is obtained from $\Gamma$ by turning $l^*$ into a goal location with
the cost function $h = \pi(l^*)(r - x) + \mathrm{OptCost}_{\Gamma'}(l^*, r)$ assigned to $l^*$; this cost function captures
the behaviour contrary to the previously considered intuition, i.e., that, upon entering $l^*$, the maximizer
spends all available time there. The game $\Gamma''$ is used to adjust the solution, to account for states from
which the intuition that leaving $l^*$ immediately is beneficial to the maximizer is incorrect. The slope of
$f_i$ is shallower than $-\pi(l^*)$, and since $l^*$ minimises the weight function, this means that $f_i$ is actually
an affine piece of one of the cost functions assigned to goal locations in the game $\Gamma$. Finally, in Step 3,
$\mathrm{OptCost}_\Gamma$ over $[b_i, e]$ is being established. It is equal to $\mathrm{OptCost}_{\Gamma'}$ over the interval $[e_i, r]$ and to $\mathrm{OptCost}_{\Gamma''}$
over the interval $[b_i, e_i]$. The algorithm then proceeds to the next iteration by setting $r = b_i$; the iterative
procedure is completed when $b_i = b$. The CostConsistent operation is used to assure consistency of
solutions between subsequent iterations.
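The procedure is as follows:

1. Assuming that we have computed $\mathrm{OptCost}_{\Gamma_{[r,e]}}$, for some $r \in I$, we set

\[
\Gamma'_{[b,r]} = \big(\mathrm{CostConsistent}(\Gamma_{[b,r]}, \mathrm{OptCost}_{\Gamma_{[r,e]}})\big)[\mathrm{urg}(l^*) := 1],
\]

and we compute $\mathrm{OptCost}_{\Gamma'_{[b,r]}} = \mathrm{SolveRP}(\Gamma'_{[b,r]})$. Let $f = (f_1, \ldots, f_k) = \mathrm{OptCost}_{\Gamma'_{[b,r]}}(l^*)$.

2. Let $i$ be the smallest natural number such that $a_i > -\pi(l^*)$ and for all $j > i$ we have $a_j \le -\pi(l^*)$.
If $i > 0$, then we define $h : I_i \to \mathbb{R}$ as $\pi(l^*)(e_i - x) + f_i(e_i)$, and

\[
\Gamma''_{[b_i,e_i]} = \big(\mathrm{CostConsistent}(\Gamma_{[b_i,e_i]}, \mathrm{OptCost}_{\Gamma'_{[e_i,r]}})\big)[L^{\mathrm{Goal}} \cup l^*, h],
\]

and compute $\mathrm{OptCost}_{\Gamma''_{[b_i,e_i]}} = \mathrm{SolveRP}(\Gamma''_{[b_i,e_i]})$.

3. We set

\[
\mathrm{OptCost}_{\Gamma_{[b_i,e]}} = \mathrm{OptCost}_{\Gamma''_{[b_i,e_i]}} \triangleright \mathrm{OptCost}_{\Gamma'_{[e_i,r]}} \triangleright \mathrm{OptCost}_{\Gamma_{[r,e]}}.
\]

If $i = 0$ the $\Gamma''$ term is omitted.

We set $r = b_i$. If $r \ne b$ then go to 1, otherwise output $\mathrm{OptCost}_{\Gamma_I}$.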

We initialize the procedure by solving the game $\Gamma'_I = \Gamma_I[\mathrm{urg}(l^*) := 1]$, and setting $r = e$. Observe
that $\mathrm{OptCost}_{\Gamma_{[e,e]}} = \mathrm{OptCost}_{\Gamma'_{[e,e]}}$.
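
The selection of the index $i$ in Step 2 of Case 2 can be sketched in Python as follows. This is only an illustration: the tuple representation of affine pieces and the function name are assumptions made here, not part of the algorithm's definition.

```python
from typing import List, Tuple

# Hypothetical representation: (b, e, slope) describes an affine piece of f
# over the interval [b, e]; pieces are listed from left to right.
Piece = Tuple[float, float, float]

def rightmost_shallow_piece(pieces: List[Piece], rate: float) -> int:
    """Return the 1-based index i of the rightmost piece whose slope is
    strictly greater than -rate (i.e. strictly shallower than -pi(l*)),
    so that every piece j > i has slope <= -rate. Return 0 if none exists."""
    for i in range(len(pieces), 0, -1):
        if pieces[i - 1][2] > -rate:
            return i
    return 0
```

With `rate` standing for $\pi(l^*)$, a result of 0 corresponds to the situation in which the $\Gamma''$ term is omitted.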

Third and last case: $L^{\mathrm{Min}} \ni l^* = \mathrm{argmin}\{\pi(l) : l \in \mathrm{NonUrgent}(\Gamma)\}$. In Case 3 of the algorithm,
an iterative procedure is applied to compute $\mathrm{OptCost}_{\Gamma}$ over the interval $I = [b, e]$; in each iteration, the
computation is restricted to the interval $[b, r]$, with $r = e$ in the first iteration. First, in Step 1, a game
$\Gamma'$ with one less non-urgent location is constructed. We obtain $\Gamma'$ from $\Gamma$ by making $l^*$ a goal location; the cost
function, $h$, assigned to $l^*$ captures the following behaviour: once $l^*$ is reached, the minimizer chooses to
spend all available time there. Second, in Step 2, the procedure identifies the rightmost interval over
which the function $f$, computed in Step 1, is strictly smaller than $h$ for at least one argument; the interval
corresponds to an affine segment of $f$, denoted by $f_i$. The argument for which the functions $f$ and $h$
are equal is denoted by $x^* \in I_i$; such an argument always exists, as $f(r) = h(r)$ by definition. Third,
in Step 2, a new game, $\Gamma''$, is constructed. Like $\Gamma'$, the game $\Gamma''$ is obtained from $\Gamma$ by turning $l^*$ into
a goal location. In this case, however, the cost function assigned to $l^*$ is $f_i$, and the game is
considered over the interval $[b_i, x^*]$; the cost function $f_i$ captures the intuition that it is beneficial to
leave $l^*$ immediately. The game $\Gamma''$ is used to adjust the solution, to account for states from which the
intuition that spending all available time in $l^*$ is beneficial to the minimizer is incorrect. The slope of $f_i$ is
shallower than that of $h$, which is equal to $-\pi(l^*)$, and since $l^*$ minimises the weight function, this means
that $f_i$ is actually an affine piece of one of the cost functions assigned to goal locations in the game $\Gamma$.
Finally, in Step 3, $\mathrm{OptCost}_{\Gamma}$ over $[b_i, e]$ is established. It is equal to $\mathrm{OptCost}_{\Gamma'}$ over the interval
$[x^*, r]$ and to $\mathrm{OptCost}_{\Gamma''}$ over the interval $[b_i, x^*]$. The algorithm then proceeds to the next iteration by
setting $r = b_i$; the iterative procedure is completed when $b_i = b$. The CostConsistent operation is used
to ensure consistency of solutions between subsequent iterations.

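The procedure is as follows:

1. Assuming that we have computed $\mathrm{OptCost}_{\Gamma_{[r,e]}}$, for some $r \in I$, let $h : [b, r] \to \mathbb{R}$ be defined as
$h(x) = -\pi(l^*)(r - x) + \mathrm{OptCost}_{\Gamma}(l^*, r)$; we set

\[ \Gamma'_{[b,r]} = \mathrm{CostConsistent}\bigl(\Gamma_{[b,r]}, \mathrm{OptCost}_{\Gamma_{[r,e]}}\bigr)[L^{\mathrm{Goal}} \cup l^*, h] \]

and compute $\mathrm{OptCost}_{\Gamma'_{[b,r]}} = \mathrm{SolveRP}(\Gamma'_{[b,r]})$. We define $f = (f_1, \ldots, f_k)$ as
$\min\{\mathrm{OptCost}_{\Gamma'_{[b,r]}}(l) : (l^*, l) \in E\}$.

2. Let $i$ be the smallest natural number such that for all $j > i$ we have $f(x) \geq h(x)$ over $[b_j, e_j]$. If
$i > 0$, let $x^*$ denote the solution of $f(x) = h(x)$ (over $[b_i, e_i]$). We then set

\[ \Gamma''_{[b_i,x^*]} = \mathrm{CostConsistent}\bigl(\Gamma'_{[b_i,x^*]}, \mathrm{OptCost}_{\Gamma_{[x^*,e]}}\bigr)[L^{\mathrm{Goal}} \cup l^*, f_i] \]

and compute $\mathrm{OptCost}_{\Gamma''_{[b_i,x^*]}} = \mathrm{SolveRP}(\Gamma''_{[b_i,x^*]})$.

3. We set

\[ \mathrm{OptCost}_{\Gamma_{[b_i,e]}} = \mathrm{OptCost}_{\Gamma''_{[b_i,x^*]}} \triangleright \mathrm{OptCost}_{\Gamma'_{[x^*,r]}} \triangleright \mathrm{OptCost}_{\Gamma_{[r,e]}}. \]

If $i = 0$ then the $\Gamma''$ term is omitted.

We set $r = b_i$. If $r \neq b$ then go to Step 1; otherwise output $\mathrm{OptCost}_{\Gamma_I}$.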

We initialize the procedure by solving the game $\Gamma_{[e,e]}$, and setting $r = e$. Observe that this can be done in
polynomial time.
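
Step 2 of Case 3 amounts to scanning the affine pieces of $f$ from right to left and locating the point where $f$ meets $h$. The sketch below illustrates this under assumptions of our own: pieces are stored as `(b, e, slope, intercept)` tuples, $h$ is a single affine function, and the function name is hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (b, e, slope, intercept) stands for the affine
# map x -> slope * x + intercept on [b, e]; pieces are listed left to right.
Piece = Tuple[float, float, float, float]

def find_crossing(f: List[Piece], h_slope: float, h_intercept: float
                  ) -> Tuple[int, Optional[float]]:
    """Return (i, x_star): i is the 1-based index of the rightmost piece on
    which f is strictly below h somewhere, and x_star is the point of that
    piece where f and h meet. Return (0, None) if f >= h everywhere."""
    for i in range(len(f), 0, -1):
        b, e, slope, intercept = f[i - 1]
        # f - h is affine on the piece, so its minimum is at an endpoint.
        diff_b = (slope * b + intercept) - (h_slope * b + h_intercept)
        diff_e = (slope * e + intercept) - (h_slope * e + h_intercept)
        if min(diff_b, diff_e) < 0:
            if diff_e <= 0:
                # f stays at or below h up to the right end of the piece.
                return i, e
            # diff_b < 0 < diff_e: unique zero of the affine difference.
            return i, b + (e - b) * (-diff_b) / (diff_e - diff_b)
    return 0, None
```

For instance, with a rightmost piece of $f$ that coincides with $h$ and a shallower piece to its left, the function reports the left piece and places $x^*$ at the point where the two pieces join.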

The following example provides the intuition behind the iterative procedure employed during each 
recursive call of the algorithm. 

Example 14 Fig. 2 shows how the iterative procedure in Case 3 of the algorithm works to compute
$\mathrm{OptCost}_{\Gamma}$ over the interval $I = [b, e]$. In diagram a) we can see that $\mathrm{OptCost}_{\Gamma}$ has been computed over
the interval $[r, e]$. The function $h$ denotes the cost function assigned to $l^*$ in $\Gamma'$ and the dashed line denotes
the function $f$, as defined in Step 1 of Case 3. One can see that the interval $[b_i, e_i]$ and $x^*$, identified in
Step 2 of Case 3 of the algorithm, are such that: over the interval $[x^*, r]$ the intuition, which indicates
that the minimizer should spend all the available time in $l^*$, is correct; and over the interval $[b_i, x^*]$
this intuition is not correct, i.e., it is beneficial for the minimizer to leave $l^*$ immediately. The dashed
and bold segment of $f$ denotes the cost function assigned to $l^*$ in $\Gamma''$. In diagram b) we can see the next
iteration of the algorithm. $\mathrm{OptCost}_{\Gamma}$ has been computed over $[r' = b_i, e]$; this iteration follows the same
steps as the previous one. Note that, over the interval $[b_i, r]$, $\mathrm{OptCost}_{\Gamma}(l^*)$ is equal to $h$ over the interval
$[x^*, r]$ and to $f$ over the interval $[b_i, x^*]$, as defined in Step 3 of Case 3 of the algorithm. □
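
Step 3 combines solutions computed over adjacent intervals into one piecewise-affine function. The following sketch reads this combination as plain concatenation of piece lists, which is an assumption made for illustration; the representation and function names are likewise hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (b, e, slope, intercept) stands for the affine
# map x -> slope * x + intercept on [b, e].
Piece = Tuple[float, float, float, float]

def concatenate(*parts: List[Piece]) -> List[Piece]:
    """Stitch piecewise-affine functions given left to right on adjacent
    intervals; empty parts (an omitted term) are skipped."""
    result: List[Piece] = []
    for part in parts:
        for piece in part:
            if result and result[-1][1] != piece[0]:
                raise ValueError("pieces must cover adjacent intervals")
            result.append(piece)
    return result

def evaluate(pieces: List[Piece], x: float) -> float:
    """Evaluate the stitched function at x (the leftmost matching piece wins)."""
    for b, e, slope, intercept in pieces:
        if b <= x <= e:
            return slope * x + intercept
    raise ValueError("x lies outside the domain")
```

Passing an empty list for the middle argument mirrors the situation in which the $\Gamma''$ term is omitted.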

Correctness and complexity. We show that the procedure SolveRP is correct, i.e., that if it terminates,
the output is in fact the OptCost function, and that it indeed terminates. We will also show that there is
an exponential upper bound on the running time of SolveRP. The main result of this paper is as follows:

Theorem 15 Given a reachability-price game $\Gamma$, the function $\mathrm{SolveRP}(\Gamma)$ terminates and outputs the
function $\mathrm{OptCost}_{\Gamma}$.

We will prove the theorem in two steps. First, we prove that if the iterative procedure in Cases 2 and
3 terminates, it computes OptCost. Second, we show that it always terminates.

Figure 2: Iterative computation of $\mathrm{OptCost}_{\Gamma}(l^*)$ over the interval $I = [b, e]$, where $l^* \in L^{\mathrm{Min}}$. Diagrams
a) and b) depict two subsequent iterations; the gray rectangle indicates the subinterval for which the
$\mathrm{OptCost}_{\Gamma}(l^*)$ function is being computed during the given iteration.



Theorem 16 Given a reachability-price game $\Gamma$, if $\mathrm{SolveRP}(\Gamma)$ terminates, it outputs the function
$\mathrm{OptCost}_{\Gamma}$.

Proof. The proof is inductive. Fix $\Gamma$, and let $l$ be the non-urgent location that minimizes $\pi(l)$. Assume
that $\Gamma$ has $n + 1$ non-urgent locations, and that Theorem 16 holds for every game $\Gamma'$ that has at most $n$
non-urgent locations.

If there are no non-urgent locations, then computing $\mathrm{OptCost}_{\Gamma}$ amounts to solving a reachability-price
game on a finite graph. It remains to prove the inductive step; there are two cases to consider: the
first case, when $l \in L^{\mathrm{Max}}$, and the second case, when $l \in L^{\mathrm{Min}}$. The proofs of these two cases follow from
Lemmas 17 and 18. □
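
To illustrate the base case, the following Python sketch solves a reachability-price game on a finite graph by value iteration. It is not the procedure used in the paper: it assumes non-negative edge prices, treats goal costs as plain numbers (as they would be at a fixed clock value), and uses hypothetical names throughout.

```python
import math
from typing import Dict, List, Tuple

def solve_finite_game(owner: Dict[str, str],
                      edges: Dict[str, List[Tuple[str, float]]],
                      goal_cost: Dict[str, float]) -> Dict[str, float]:
    """Value iteration for a min/max reachability-price game on a finite graph.
    owner maps every vertex to "min", "max" or "goal"; edges maps a vertex to
    (successor, price) pairs; goal_cost gives the cost of stopping at a goal."""
    value = {v: goal_cost.get(v, math.inf) for v in owner}
    for _ in range(len(owner)):            # assumed cap on the number of rounds
        new_value = dict(value)
        for v, who in owner.items():
            if v in goal_cost:
                continue                    # goal values are fixed
            options = [price + value[u] for u, price in edges.get(v, [])]
            if not options:
                new_value[v] = math.inf     # dead end: goal is never reached
            elif who == "min":
                new_value[v] = min(options)
            else:
                new_value[v] = max(options)
        if new_value == value:              # fixed point reached
            break
        value = new_value
    return value
```

A value of `math.inf` marks vertices from which the minimizer cannot force reaching a goal within the explored number of rounds.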

Lemma 17 Given a reachability-price game $\Gamma$ with the price-rate minimizing location in $L^{\mathrm{Max}}$, if Case
2 of SolveRP terminates, it outputs $\mathrm{OptCost}_{\Gamma}$.

Proof. Case 2 is handled in the same way as in the algorithm of Bouyer et al. [9]. □



Lemma 18 Given a reachability-price game $\Gamma$ with the price-rate minimizing location in $L^{\mathrm{Min}}$, if Case
3 of SolveRP terminates, it outputs $\mathrm{OptCost}_{\Gamma}$.

Proof. Without loss of generality we assume that we are dealing with a single interval $I$. Let $l^*$, $\Gamma'$, $\Gamma''$,
$f$, and $x^* \in I$ be defined as in Case 3 of the algorithm.

We will show that $\mathrm{OptCost}_{\Gamma'}(l^*) = \mathrm{OptCost}_{\Gamma}(l^*)$ over the interval $[x^*, e]$ and that
$\mathrm{OptCost}_{\Gamma''}(l^*) = \mathrm{OptCost}_{\Gamma}(l^*)$ over the interval $[b, x^*]$. This, together with Lemmas 11 and 12, enables us
to establish that the procedure for computing $\mathrm{OptCost}_{\Gamma}$ over $I$ in Case 3 is correct. By the inductive
hypothesis, we can solve $\Gamma'$ and $\Gamma''$, since these games have one less non-urgent location than $\Gamma$.
Additionally, in games $\Gamma'$ and $\Gamma''$ we restrict the minimizer, so we immediately have:

\[ \mathrm{OptCost}_{\Gamma'/\Gamma''}(l)(x) \geq \mathrm{OptCost}_{\Gamma}(l)(x), \]

for every $l \in L$ and every $x \in [x^*, e]$ / $[b, x^*]$. By definition of $\Gamma'$, there is an equality for $x = e$. To complete
the proof, we need to show the reverse inequality.

There are two cases to consider. We start with the first case, i.e., we will show that
$\mathrm{OptCost}_{\Gamma'}(l^*) \leq \mathrm{OptCost}_{\Gamma}(l^*)$ over $[x^*, e]$.

Fix $\varepsilon > 0$ and a strategy $\mu \in \Sigma^{\mathrm{Min}}$ (note that $\Gamma$ and $\Gamma'$ admit the same sets of strategies). We take
$\chi_\varepsilon \in \Sigma^{\mathrm{Max}}$ that is $\varepsilon$-optimal for $\mu$ in $\Gamma'$. For every $s \in \{l^*\} \times [x^*, e]$ let $\omega$ denote the unique run
$\mathrm{Run}(s, \mu, \chi_\varepsilon)$, and let us assume that it visits $l^*$ exactly $m$ times, after transitions $i_1, \ldots, i_m$. We assume, without loss of
generality, that after every such transition, a continuous one is taken. We then have:

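\[
\begin{aligned}
\mathrm{Cost}_{\Gamma}(\omega) &= \mathrm{Price}(\omega_{i_m}) + \lambda_{i_m+1} \cdot \pi(l^*) + \mathrm{Cost}_{\Gamma}(\mathrm{Run}(\omega_{i_m+1}, \mu, \chi_\varepsilon)) \\
&\geq \mathrm{Price}(\omega_{i_m}) + \lambda_{i_m+1} \cdot \pi(l^*) + \mathrm{OptCost}_{\Gamma'}(s_{i_m+1}) - \varepsilon \\
&\geq \mathrm{Price}(\omega_{i_1}) + (x_{i_m} + \lambda_{i_m+1} - x_{i_1}) \cdot \pi(l^*) + \mathrm{OptCost}_{\Gamma'}(s_{i_m+1}) - \varepsilon \\
&\geq \mathrm{Price}(\omega_{i_1}) + (e - x_{i_1}) \cdot \pi(l^*) + \mathrm{OptCost}_{\Gamma}((l^*, e)) - \varepsilon \\
&= \mathrm{OptCost}_{\Gamma'}(s) - \varepsilon,
\end{aligned}
\]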

where $s_i = (l_i, x_i)$. The first inequality holds because $\chi_\varepsilon$ is $\varepsilon$-optimal for $\mu$ in $\Gamma'$, and because the
suffix of $\omega$ starting in $s_{i_m}$ does not visit $l^*$ but for the first transition, which comes at a price zero. The
second inequality holds because $l^*$ minimizes $\pi$. The third inequality follows from the definition of
$\mathrm{OptCost}_{\Gamma'}(l^*)$. Finally, $\mathrm{Len}(\omega_{i_1}) = 0$, hence $\mathrm{Cost}(\omega_{i_1}) = 0$.
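
The second inequality in the chain above can be unpacked as follows. This is a sketch under two assumptions that are not stated in this part of the text: the clock is not reset between the first and the last visit to $l^*$, and discrete transitions contribute non-negative prices. Time elapses only in non-urgent locations, each of which has a price rate of at least $\pi(l^*)$, so the price accumulated between the two visits is bounded from below by the elapsed time multiplied by $\pi(l^*)$:

```latex
\[
\mathrm{Price}(\omega_{i_m}) - \mathrm{Price}(\omega_{i_1})
  \;\geq\; \pi(l^*)\,(x_{i_m} - x_{i_1}).
\]
```

Adding $\lambda_{i_m+1} \cdot \pi(l^*)$ to both sides gives the term $(x_{i_m} + \lambda_{i_m+1} - x_{i_1}) \cdot \pi(l^*)$ that appears in the third line of the chain.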

We now proceed to the second case, i.e., that $\mathrm{OptCost}_{\Gamma''}(l^*) \leq \mathrm{OptCost}_{\Gamma}(l^*)$ over $[b, x^*]$. The fact
that $\mathrm{OptCost}_{\Gamma'}(l^*) \geq \mathrm{OptCost}_{\Gamma}(l^*)$ over $[b, x^*]$ implies that the slope of $f$ is shallower than $-\pi(l^*)$.

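Fix $\varepsilon > 0$ and $\mu \in \Sigma^{\mathrm{Min}}$ in $\Gamma$. Let $\chi_\varepsilon \in \Sigma^{\mathrm{Max}}$ be a strategy that is $\varepsilon$-optimal for $\mu$ in $\Gamma''$ (the sets of
strategies in $\Gamma$ and $\Gamma''$ are equal). For every $s \in \{l^*\} \times [b, x^*]$, let $\omega$ denote the unique run $\mathrm{Run}(s, \mu, \chi_\varepsilon)$,
which pays $m$ visits to $l^*$ (the notation and assumptions are the same as in the first case). We then have: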

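\[
\begin{aligned}
\mathrm{Cost}_{\Gamma}(\omega) &= \mathrm{Price}(\omega_{i_m}) + \lambda_{i_m+1} \cdot \pi(l^*) + \mathrm{Cost}_{\Gamma}(\mathrm{Run}(\omega_{i_m+1}, \mu, \chi_\varepsilon)) \\
&\geq \mathrm{Price}(\omega_{i_m}) + \lambda_{i_m+1} \cdot \pi(l^*) + \mathrm{OptCost}_{\Gamma''}(s_{i_m+1}) - \varepsilon \\
&\geq \mathrm{Price}(\omega_{i_1}) + (x_{i_m} + \lambda_{i_m+1} - x_{i_1}) \cdot \pi(l^*) + \mathrm{OptCost}_{\Gamma''}(s_{i_m+1}) - \varepsilon \\
&\geq \mathrm{Price}(\omega_{i_1}) + \mathrm{OptCost}_{\Gamma''}(s_{i_1}) - \varepsilon \\
&= \mathrm{OptCost}_{\Gamma''}(s_{i_1}) - \varepsilon.
\end{aligned}
\]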

The first inequality holds because $\omega$ does not feature transitions ending in $l^*$, modulo its prefix $\omega_{i_m+1}$,
and because $\chi_\varepsilon$ is $\varepsilon$-optimal for $\mu$ in $\Gamma''$. The second inequality holds because $l^*$ minimizes $\pi$. The
final inequality holds because the slope of $f$ is shallower than $-\pi(l^*)$ over $[b, x^*]$. This finishes the proof
of the theorem. □

We have proved that the algorithm is partially correct. It remains to prove its total correctness, i.e., 
that it terminates. 

Theorem 19 The algorithm SolveRP terminates. 

Proof. To prove termination of the algorithm, we need to prove the termination of the iterative
procedures from Cases 2 and 3 of the algorithm.

The two cases are different; however, they have one thing in common. In each iteration, in
both cases, an interval with slope shallower than $-\pi(l^*)$ is processed. Since $l^*$ minimizes $\pi$,
by Lemmas 3 and 13 this interval corresponds to a segment of a goal cost function. We argue that the
number of iterations in each case is bounded by the number of all the possible intersections of the cost
functions assigned to goal locations, which is finite.

More precisely, let $\Gamma'$, $I$, $f$ and $i$ be defined as in Case 2 of the algorithm. The slope of $f_i$ over $I_i$ is
shallower than $-\pi(l^*)$, so by Lemmas 3 and 13 $f_i$ coincides with some cost function over $I_i$, denoted
by $g$. If $i > 1$, then there are two possibilities: either $b_i$ coincides with an intersection of cost functions
from two different goal locations, or $f_{i-1}$ has slope equal to $-\pi(l)$ for some non-goal
location $l$.

In the first case, the iteration must have moved past one of the finitely many intersection points. In
the second case, we need to argue that if the procedure once again encounters the same affine piece of $g$
(but over a different interval), then it must have also passed at least one of the finitely many intersection
points. Let $I' = [b', e'] \subset [b, b_i)$ denote the interval over which the procedure encounters $g$ again. Assume
that the procedure did not pass any intersection points before $I'$. This implies that $\mathrm{OptCost}_{\Gamma'}(l^*) \geq g$ over
$[e', b_i]$, and hence $\mathrm{OptCost}_{\Gamma'}(l^*)$ must contain an affine piece that has a slope shallower than $g$ over $[e', z]$,
for some $z \in (e', b_i]$. However, such a piece coincides with a goal cost function, so an intersection point
must have been passed.

So far we have shown termination of the iterative procedure in Case 2. It remains to show the same
for Case 3. Let $\Gamma'$, $f$, $i$, and $x^*$ be defined as in Case 3 of the algorithm. We argue that each affine segment
of a goal cost function is processed only once. If $i > 1$, then $f_{i-1}$ either coincides with a different piece
of a goal cost function, which means that we have passed one of the finitely many intersection points and
the $f_i$ segment has been processed, or its slope is equal to $-\pi(l)$, for some non-goal location $l$. In the
latter case, we have that $\mathrm{OptCost}_{\Gamma}(l^*)$ has a slope steeper than $f_i$, and hence, it is strictly greater than $f_i$
over $[b, b_i)$. This means that in the subsequent iterations, if $f_i$ is to be encountered, a piece with a slope
smaller than that of $f_i$ needs to occur, but this means that an intersection point has been processed.

We have shown that in each step of the algorithm the iterative procedure of Cases 2 and 3 terminates,
and hence, the algorithm terminates. □
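
The termination argument rests on the fact that finitely many affine pieces have finitely many pairwise intersection points. The sketch below enumerates these points for a list of pieces; the tuple representation and the function name are assumptions made for illustration only.

```python
from itertools import combinations
from typing import List, Tuple

# Hypothetical representation: (b, e, slope, intercept) stands for the affine
# map x -> slope * x + intercept on [b, e].
Piece = Tuple[float, float, float, float]

def intersection_points(pieces: List[Piece]) -> List[float]:
    """Return the sorted clock values at which two non-parallel pieces cross
    inside the overlap of their intervals."""
    points = set()
    for (b1, e1, s1, c1), (b2, e2, s2, c2) in combinations(pieces, 2):
        if s1 == s2:
            continue                       # parallel pieces never cross once
        x = (c2 - c1) / (s1 - s2)          # solve s1*x + c1 == s2*x + c2
        if max(b1, b2) <= x <= min(e1, e2):
            points.add(x)
    return sorted(points)
```

For $p$ pieces there are at most $p(p-1)/2$ such points, so any procedure that passes at least one of them per iteration performs finitely many iterations.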

We have proved that the procedure SolveRP is correct. The question that remains is its complexity.
We have the following result.

Theorem 20 The algorithm SolveRP is in EXPTIME. 

Proof. Given an automaton $\mathcal{A}$, let $n$ denote the number of non-urgent locations and $p$ the total number
of pieces in $\mathrm{CF}^{\mathrm{Goal}}$. The complexity of computing the solution using SolveRP depends on the
number of pieces that constitute $\mathrm{OptCost}_{\Gamma}$. Let $N(n, p)$ denote the upper bound on the number of pieces
in $\mathrm{OptCost}_{\Gamma}$. We now construct a recursion to characterise $N(n, p)$.



If $n = 0$, then $N(n, p) = 2p$ [9]. If $n > 0$, both cases take $p$ iterations (as argued in the proof of Theorem 19),
and each requires solving two games with the solution complexity equal to $N(n - 1, p + n)$
and $N(n - 2, p + n - 1)$. We can assume that $p > n$ is the case of real interest, so we have
$N(n, p) \leq 2p\,N(n - 1, 2p)$. It can be easily verified that $N(n, p) \leq 2^{\frac{(n+1)(n+2)}{2}} p^{n+1}$. This establishes that SolveRP is
indeed in EXPTIME. □
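
As a sanity check of the recursion, the following Python sketch unrolls $N(n, p) \leq 2p\,N(n - 1, 2p)$, treated here as an equality, and compares the result with a closed form. Both the base case $N(0, p) = 2p$ and the closed form $2^{(n+1)(n+2)/2}\,p^{n+1}$ are our reading of the bound and should be taken as assumptions; the function names are hypothetical.

```python
def pieces_bound(n: int, p: int) -> int:
    """Unroll N(n, p) = 2p * N(n - 1, 2p) with the assumed base case N(0, p) = 2p."""
    if n == 0:
        return 2 * p
    return 2 * p * pieces_bound(n - 1, 2 * p)

def closed_form(n: int, p: int) -> int:
    """Assumed closed form: 2^((n+1)(n+2)/2) * p^(n+1)."""
    return 2 ** ((n + 1) * (n + 2) // 2) * p ** (n + 1)
```

Under these assumptions the two functions agree, and the bound is exponential in the number $n$ of non-urgent locations while remaining polynomial in $p$ for fixed $n$.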

Discussion. We now briefly compare our algorithm with that of Bouyer et al. The 3EXPTIME
algorithm, introduced by Bouyer et al. [9], differs from the one presented in this paper in the way the
minimizer locations are handled. As was explained above, the algorithm presented in this paper uses
an iterative procedure, similar in spirit to that used for handling maximizer locations. The 3EXPTIME
algorithm, on the other hand, exploited the following observation: if the location $l$ that minimizes
the weight function is visited several times, then the minimizer would not be worse off if, upon the first
visit, she had waited for the whole time that passes between the first and the last visit. This is valid because
all other locations have a higher value of the weight function. This intuition is formally captured by the
algorithm in the following way. Two copies of the original automaton are created, with the only
difference that in both of them $l$ becomes a goal location, and hence both automata have one less non-urgent
location; this duplication introduces an exponential blowup in complexity. The first automaton
captures the behaviour before $l$ is entered for the first time, whereas the second copy captures the behaviour
afterwards. In the second copy, $l$ is transformed into a goal location with a cost function equivalent to
positive infinity; this captures the intuition that it is sufficient to visit $l$ only once. The algorithm first
computes OptCost for the second copy, then, using that result, computes OptCost for all states having $l$
as the location (this is in fact a game with a single non-urgent location), and finally computes OptCost
for the first copy, with the OptCost computed in the previous step assigned as a cost function to $l$.
The OptCost computed for the first copy is the sought solution. The second exponential blowup originated
from the construction that made it possible to assume that the clock value is bounded by 1. The construction used by
Bouyer et al. [9] used locations to encode the integer part of the clock value, and the clock itself captured
only the fractional part of the clock value. This yielded an exponential blowup, as there had to be a copy
of the original location for every integer value between 0 and the value of the largest constant provided in
the definition of the automaton (recall that constants are encoded in binary). Our construction, presented
in Sec. 2, avoids this blowup.